Installation Guide
Refer to the installation guide below. Some SDK components, including the RBLN Compiler (`rebel-compiler`) and `vllm-rbln`, require an RBLN Portal account for installation. If you need assistance, please contact us.
1. RBLN Driver
Note
The RBLN Driver is primarily intended for on-premise servers.
If RBLN NPU devices are already visible on your server (`ls /dev/rbln*`), you can skip the driver installation.
The RBLN Driver contains the Linux kernel driver and firmware, enabling the OS to recognize RBLN NPU devices. It is pre-installed on most cloud servers.
Key Features
- Kernel Driver & Firmware: Enables the OS to interface with the RBLN NPU.
- Package Formats: Available as Ubuntu (`.deb`) and RedHat (`.rpm`) packages.
Installation
- Ubuntu
- RedHat
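A minimal sketch of the driver installation for each distribution; the package file names below are placeholders for the files you obtained from Rebellions:

```bash
# Ubuntu: install the RBLN Driver from a local .deb package
# (placeholder file name; root privileges required).
sudo apt-get install ./rbln_driver_<version>_amd64.deb

# RedHat: install from a local .rpm package (placeholder file name).
sudo yum install ./rbln_driver-<version>.rpm

# Verify that the NPU devices are now visible.
ls /dev/rbln*
```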
Additional Notes
- Root privileges are required for installation on on-premise servers.
- If you need `.deb` or `.rpm` files, please contact us.
2. RBLN Compiler
The RBLN Compiler is the core component of the RBLN SDK, used to convert pre-trained models into an NPU-executable format. It also provides runtime environments (Python and C/C++) and profiling tools.
Note
An RBLN Portal account is required for installation.
Key Features
- Compile API: Converts pre-trained models into RBLN NPU-executable formats.
- Runtime API:
    - Python runtime: Installed via a `.whl` package.
    - C/C++ runtime: Requires GPG key registration and apt-based installation. See C/C++ runtime installation for details, and the hedged sketch after this list.
- Profiler Support: Offers performance analysis and optimization with the RBLN Profiler.
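The C/C++ runtime setup follows the usual pattern for a vendor apt repository. This is a hedged sketch only: the key URL, repository address, and package name below are placeholders, not the documented values; see C/C++ runtime installation for the real ones.

```bash
# Register the vendor GPG key (placeholder URL).
wget -qO- https://<rbln-apt-repo>/rbln.gpg | sudo gpg --dearmor -o /usr/share/keyrings/rbln.gpg

# Add the apt repository (placeholder address), then install the runtime.
echo "deb [signed-by=/usr/share/keyrings/rbln.gpg] https://<rbln-apt-repo> stable main" | sudo tee /etc/apt/sources.list.d/rbln.list
sudo apt-get update
sudo apt-get install <rbln-cpp-runtime-package>
```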
Installation
- Distributed as a `.whl` package. Install using `pip`:
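A hedged sketch of the `pip` command; the package index URL and version are placeholders, since the actual values come with your RBLN Portal account:

```bash
# Install the RBLN Compiler wheel from the RBLN package index
# (placeholder index URL and version; requires RBLN Portal credentials).
pip3 install -i https://<rbln-package-index>/simple/ rebel-compiler==<latest_version>
```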
3. HuggingFace Model Support (`optimum-rbln`)
`optimum-rbln` integrates HuggingFace APIs, making it easy to compile pre-trained `transformers` and `diffusers` models to run on RBLN NPUs.
Key Features
- HuggingFace Integration: Seamlessly supports `transformers` and `diffusers` for RBLN-based inference.
- Easy Deployment: Simplifies model loading and optimization for RBLN NPUs.
Installation
- Distributed as a `.whl` package:
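A minimal sketch, assuming the wheel is installable directly with `pip`:

```bash
# Install the HuggingFace integration layer for RBLN NPUs.
pip3 install optimum-rbln
```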
4. RBLN Model Zoo
RBLN Model Zoo provides ready-to-use examples for compiling and running pre-trained models on RBLN NPUs. It serves as a reference for adapting custom models.
Key Features
- Pre-trained Models: Contains a diverse collection of scripts for various popular pre-trained models.
- Implementation Guides: Offers step-by-step instructions for compiling and running models on RBLN NPUs.
Installation
- Hosted on GitHub. Clone the repository with:
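A sketch of the clone command, assuming the repository lives at `rebellions-sw/rbln-model-zoo` on GitHub:

```bash
# Clone the RBLN Model Zoo (assumed repository path).
git clone https://github.com/rebellions-sw/rbln-model-zoo.git
```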
5. Serving Frameworks Support
RBLN NPUs integrate with popular serving solutions, including vLLM, Nvidia Triton Inference Server, and TorchServe.
Key Features
- vLLM Support (`vllm-rbln`)
    - Custom vLLM solution for serving large language models (LLMs) on RBLN NPUs.
    - Distributed as a `.whl` package.
    - Requires an RBLN Portal account for installation.
- Nvidia Triton Inference Server Support
    - Refer to Nvidia Triton Inference Server Support for configuration details.
- TorchServe Support
    - Refer to TorchServe Support for installation and usage instructions.
Installation
- vLLM (`vllm-rbln`): install the wheel with `pip`, as shown in the sketch after this list.
- Nvidia Triton Inference Server and TorchServe
    - Visit the Nvidia Triton Inference Server and TorchServe documentation pages for instructions and integration details.
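A hedged sketch for `vllm-rbln`; as with `rebel-compiler`, the index URL and version are placeholders that come with your RBLN Portal account:

```bash
# Install the RBLN-enabled vLLM build (placeholder index URL and version;
# requires RBLN Portal credentials).
pip3 install -i https://<rbln-package-index>/simple/ vllm-rbln==<latest_version>
```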
Congratulations on setting up the RBLN SDK. You can now run PyTorch and TensorFlow models on RBLN NPUs.
Explore the Tutorials for a deeper understanding of how to use the RBLN SDK.